The UPC Submission to the WMT 2012 Shared Task on Quality Estimation

Authors

  • Daniele Pighin
  • Meritxell González
  • Lluís Màrquez i Villodre
Abstract

In this paper, we describe the UPC system that participated in the WMT 2012 shared task on Quality Estimation for Machine Translation. Based on the empirical evidence that fluency-related features have a very high correlation with post-editing effort, we present a set of features for quality estimation of machine translation designed around different kinds of n-gram language models, plus another set of features that model the quality of dependency parses automatically projected from source sentences to translations. We document the results obtained on the shared task dataset by combining the features that we designed with the baseline features provided by the task organizers.

Similar resources

Ranking Translations using Error Analysis and Quality Estimation

We describe TerrorCat, a submission to this year’s metrics shared task. It is a machine learning-based metric that is trained on manual ranking data from WMT shared tasks 2008–2012. Input features are generated by applying automatic translation error analysis to the translation hypotheses and calculating the error category frequency differences. We additionally experiment with adding quality es...

LIMSI Submission for the WMT'13 Quality Estimation Task: an Experiment with N-Gram Posteriors

This paper describes the machine learning algorithm and the features used by LIMSI for the Quality Estimation Shared Task. Our submission mainly aims at evaluating the usefulness for quality estimation of n-gram posterior probabilities that quantify the probability for a given n-gram to be part of the system output.

Black Box Features for the WMT 2012 Quality Estimation Shared Task

In this paper we introduce a number of new features for quality estimation in machine translation that were developed for the WMT 2012 quality estimation shared task. We find that very simple features such as indicators of certain characters are able to outperform complex features that aim to model the connection between two languages.

YSDA Participation in the WMT'16 Quality Estimation Shared Task

This paper describes Yandex School of Data Analysis (YSDA) submission for WMT2016 Shared Task on Quality Estimation (QE) / Task 1: Sentence-level prediction of post-editing effort. We solve the problem of quality estimation by using a machine learning approach, where we try to learn a regressor from feature space to HTER score. By enriching the baseline features with the syntactical features an...

Word embeddings and discourse information for Quality Estimation

In this paper we present the results of the University of Sheffield (SHEF) submissions for the WMT16 shared task on document-level Quality Estimation (Task 3). Our submissions explore discourse and document-aware information and word embeddings as features, with Support Vector Regression and Gaussian Process used to train the Quality Estimation models. The use of word embeddings (combined with b...



Publication date: 2012